Bermuda, a data-driven tool for phonetic transcription of words

Authors

  • Tiberiu Boroș
  • Dan Ștefănescu
  • Radu Ion

Abstract

The article presents the Bermuda component of the NLPUF text-to-speech toolbox. Bermuda performs phonetic transcription of out-of-vocabulary words using a Maximum Entropy classifier and a custom-designed algorithm named DLOPS. It can transcribe words directly with either of the two algorithms, or it can chain either algorithm to a second-layer Maximum Entropy classifier designed to correct first-layer transcription errors. Bermuda can be used on its own, outside the NLPUF package, or to improve the performance of other modular text-to-speech packages. The article presents the training steps, illustrates the transcription process with examples, and reports an initial evaluation. It closes with usage examples of Bermuda.

Similar resources

An Automaton-based Machine for Automatic Phonetic

In this work, we present an innovative approach for grapheme-to-phoneme conversion, which achieves very low error rates for languages like British English, American English and Dutch, and generalizes well. One of the basic steps in the text-to-speech conversion performed by speech synthesis systems is the phonetic transcription of the input text that can be considered ...

Automatic Phonetic Transcription by Phonological Derivation

Automatic phonetic transcription tools usually perform phonetic transcriptions directly from orthographic representations. Although these approaches often achieve good results, theoretical studies suggest that including morphophonological knowledge allows those systems to improve their performance. Following this idea, we developed a tool which first obtains an underlying representation of each...

Soft-computing Methods for Text-to-Speech Driven Avatars

This paper presents a new approach for driving avatars with text-to-speech synthesis that uses pure text as an information source. The goal is to move lips and face muscles on the basis of the phonetic nature of the utterance and the related expression. Several methods came together to define this solution. Rule-based text-to-speech synthesis generates phonetic and expression transcription of t...

Lexicon Adaptation for Broadcast News Transcription

This paper presents a technique for dynamically extending the language model lexicon of an Italian broadcast news transcription system. New words are selected day by day from contemporary news available on the Internet, according to a strategy that tries to minimize the out-of-vocabulary rate of the language model. Phonetic transcriptions of new words are generated automatically with an in-hous...

Multigram-based grapheme-to-phoneme conversion for LVCSR

Many important speech recognition tasks feature an open, constantly changing vocabulary (e.g. broadcast news transcription, spoken document retrieval, ...). Recognition of new words requires acoustic baseforms for them to be known. Commonly, words are transcribed manually, which poses a major burden on vocabulary adaptation and interdomain portability. In this work we investigate the possib...

Publication date: 2012